Analytical continuation of imaginary axis data for optical conductivity 
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We compare difTerent methods for performing analytical continuation of spectral data from the 
imaginary time or frequency axis to the real frequency axis for the optical conductivity a{Lj). We 
compare the maximum entropy (MaxEnt), singular value decomposition (SVD), sampling and Pade 
methods for analytical continuation. We also study two direct methods for obtaining cr(0). For the 
MaxEnt approach we focus on a recent modification. The data are split up in batches, a separate 
MaxEnt calculation is done for each batch and the results are averaged. For the problems studied 
here, we find that typically the SVD, sampling and modified MaxEnt methods give comparable 
accuracy, while the Pade approximation is usually less reliable. 



I. INTRODUCTION 

For strongly correlated systems analytical methods 
usually involve uncontrolled approximations. There- 
fore stochastical methods such as quantum Monte-Carlo 
(QMC)fi quantum cluster methods- or continuous time 
methods^ are often used. Apart from statistical errors, 
such methods can produce quite accurate results, but 
the results are obtained on the imaginary axis. A major 
problem is then the analytically continuing of the results 
to the real axis, which is an ill-posed problem. Small 
changes in the data on the imaginary axis can lead to 
large changes on the real axis. Since the imaginary axis 
data contain statistical noise, the analytical continuation 
is very difhcult. 

There are different ways of regularizing this ill-posed 
problem. One method combines the Bayesian theory 
with the maximum entropy approach (MaxEnt), which 
has been found to be an efficient method for analyt- 
ical continuation.'''^ Other regularizations are used in 
the singular value decomposition (SVD)-!^ or stochastic 
regularization^ methods. An alternative is provided by 
making a Pade approximation to the data as a function of 
imaginary frequency and then analytically continue the 
Pade expression to real frequencies i^iiS A rather differ- 
ent approach is to use sampling methods, where a large 
number of spectra are added, weighted by the proba- 
bility that they correspond to the imaginary axis data. 
Such methods have been proposed for T = Oii and fi- 
nite TJ^ Finally, there are simple approximate methods 
for obtaining the optical conductivity at zero frequency, 
a{u) = 0) directly from imaginary time or frequency data. 

Two-particle correlation functions, such as the dynam- 
ical spin or charge correlation functions or the optical 
conductivity, provide important information about a va- 
riety of properties of the system. These two-particle func- 
tions are much more difficult to calculate in QMC-like 
frameworks than the one-particle Green's function^i^ and 
therefore much of the interest has focused on the elec- 
tron Green's function. Here we therefore instead treat a 
two-particle function, the optical conductivity. While we 
here focus on transformation of QMC data from imagi- 
nary space to real space, we note that there are also QMC 



methods giving results directly for real frequencies 
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In this paper we compare the Pade, SVD, sampling 
and MaxEnt methods for obtaining the optical conduc- 
tivity from imaginary axis data. We define a frequency 
dependent optical conductivity, cr{uj), where w is a real 
frequency. This we refer to as the "exact" result. This 
(j{uj) can easily be transformed to the imaginary axis, 
since this is a well-behaved transformation that can be 
performed with a high accuracy. We add statistical noise 
to the data, which then simulate the output of a QMC 
calculation. The data arc then transformed back to the 
real axis, using the various methods for analytical contin- 
uation. If the methods work well, we should essentially 
recover the starting cr(a;), the "exact" result. This way 
we can judge the accuracy of the different methods. It is 
important to compare with a known "exact" result, since 
analytical continuation methods can give spurious struc- 
tures due to noise in the data. If a certain method A 
gives more structures than another method B, it is hard 
to judge whether these additional structures are real and 
method A is better or they are due to noise and method 
B is better. This problem is avoided if "exact" results 
arc known. Here we construct the "exact" cr(w) using re- 
sults for the two-dimensional Hubbard model as a guide 
for the general shape. 



We find that the SVD, sampling and MaxEnt meth- 
ods tend to give comparable accuracy, while the Pade 
approximation often gives worse results. In particular 
the Pade approximation often overestimates cr(0). One 
of the direct methods for estimating a{0) (based on Eq.[6] 
in Sec. |TT| underestimates cr(0), in particular for a nar- 
row Drude peak, while the other (extrapolating Eq. ([5]) 
in Secinito = 0) typically gives better results. 



In Sec. [IT] we present some general results for the op- 
tical conductivity. The different methods for analytical 
continuation are presented in Sec. Illll and the results are 
show in Sec. [iVl 
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II. OPTICAL CONDUCTIVITY AND 
CURRENT-CURRENT CORRELATION 
FUNCTION 

The optical conductivity a{uj) is obtained from the 
current-current correlation function 



n(r)= 3^(j(r).j(0)}, 



(1) 



where N is the number of sites, j is the current operator, 
j(r) = exp(i7r)jexp(— i/r), r is imaginary time and (...) 
is the thermodynamic average. We then have (setting 

h = kB = 1) 



n(T) 



K{t, uj)a{uj)du! 



where 



K{t, uj) = - 



TT 1 



(2) 



(3) 



is a bosonic kernel and /3 = l/T. Alternatively, we can 
relate (j{uj) to the Fourier transform IV{v) of n(T) 
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FIG. 1: (color on-line) The current-current correlation func- 
tion i^{v) multiplied by as a function of v as well as the 
function ■ko{u) calculated without vertex corrections. Results 
are shown for a large number n/3 = 160 time slices as well 
as for a small = 60, illustrating the resulting poor ac- 
curacy for large u in the latter case. The figure also shows 
7(//) = [ti{Q) — niy)]/ V [Eq. ((S])] and its extrapolation to = 
(thin dotted line), which provides an estimate of (7(0). 
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-a{uj)duj 



1 



;a{uj)du;, 



(4) 



where we have used that a{uj) — a{—io) and v ~ Vi = iv^ 
is a multiple of vq = 2ttT. For large we have that 
tt{i') ^ . This result can also be rewritten as 



n(o) - n(^) 

-aiojjduj = 



V 



7M- (5) 



This provides a convolution of the optical conductivity 
with a Lorentzian with the width v. In particular, if (j{lS) 
has little variation over an energy range of the order of vq , 
7(j/ = Vq) provides an estimate of (t(0). This estimate can 
be improved by extrapolating ^(y) to = 0, as discussed 
below. Alternatively, we can use^'^ 



a(0) 



n(r 



/3 



(6) 



which is accurate if a(y)) has little variation over an en- 
ergy range of the order of 0.69(27rr). 

Typical results for n(i/) are shown in Fig. [T] for a two- 
dimensional (2d) Hubbard model on a square lattice with 
nearest [t = —0.4 eV) and second nearest [t — 0.12 eV) 
hopping. The Coulomb interaction is ?7 = 3.2 eV and 
/? = 15 eV^^. This gives the occupancy 0.95. To obtain 
a conductivity, we have assumed that such 2d sheets are 
stacked on top of each other with a distance c = 6.6 A 
appropriate for La2_a;Sr2;Cu04. The data were obtained 
from a calculation in the dynamical cluster approxima- 
tion (DCA)^ using a cluster with eight sites. 



From Eq. ^ we can see that n(z^) 



for large 



V. In Fig. [T] we show n(:/)j/^, which indeed saturates for 
large v. From Eq. we expect this to happen when v 
is much larger than a typical energy scale of cr(Li;), which 
in this case has peaks at w = and w w ±C/ — ±3.2 
eV. In agreement with this, FigU] shows saturation for 
V of the order of 10 eV. For larger values of i^, n(i/) 
essentially just gives information about J u!'^a(uj)du!. The 
figure also shows results for no(i^), which is calculated 
neglecting all vertex corrections. IIq is then obtained 
simply as a product (bubble) of two (dressed) Green's 
functions. Even for large v, H and IIq are different. This 
can be understood from Eq. (U). Although both behave 
as the prefactor is different. 

Fig.[T]also shows [11(0)— n(i/)]/z^ [Eq. ([5])], providing an 
estimate of cr(0) . To improve this estimate we extrapolate 
to = 0. For small values of i^, n(z^) depends mainly on 
a{'w) for small w. We allow for the possibility that a{u!) 
has a Drude like peak at w = by using the Ansatz 



cr(uj) = a + b- 



T/tt 



(7) 



where we have also added a constant a. In Fig. [T] we 
have fitted this form to the results for the lowest three 
non zero values of v. This extrapolation greatly improves 
the estimate, as can be seen from the examples below. 
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III. METHODS 



which satisfy 



A. Pade Approximation 

In the Pade approximation a function f{z) in the com- 
plex plane, z, is described as the ratio between to polyno- 
mials P{z) and Q{z), f{z) = P{z)/Q{z). The function is 
fitted to the output of a QMC calculation so that the re- 
sults for certain imaginary frequencies Vn are reproduced 
exactly. The analytical continuation is then performed 
by evaluating the function on the real axis. In the con- 
text of Green's functions this has in particular been used 
by Vidberg and Serene.— They fit to N data points, using 
a construction which for an even N leads to a polynomial 
Q which is one order higher than P, so that P/Q behaves 
as 1/z for large z. This is appropriate for Green's func- 
tions, considered by them, but not necessarily for the re- 
sponse functions considered here, which behave as l/z^ 
for large z. We have therefore constructed a Pade ap- 
proximation where Q is two orders higher than P, which 
is used in the following. This construction requires N 
to be odd. For the special case considered by Vidberg 
and Serene there are simple formulas for generating the 
polynomials^ while this is somewhat more complicated 
in the general case.^" 

In fitting the P and Q to iV data points, we have used 
data for one negative frequency,— t'o = —2t:T, and the 
TV— 1 lowest nonnegative frequencies. This typically gives 
more stable results than using only nonnegative frequen- 
cies. On the other hand, using positive and negative 
frequencies symmetrically tends to put poles close to the 
real axis and gives poor spectra on the real axis. One 
negative frequency therefore often appears to be a good 
compromise. 



B. Singular value decomposition 

A widely used technique for inverse problems is the sin- 
gular value decomposition (SVD)i^il Here we essentially 
follow Creffield et al.^ except that we work in imaginary 
frequency space rather than in imaginary time space, for 
reasons discussed Sec. IIVI In the SVD method, the real 
frequency space is spanned by a set of eigenvectors. The 
kernel in Eq. is discretized, giving 



OLkV 



(8) 



If the data for different imaginary frequencies Vi are un- 
correlated, as is the case here, we introduce the eigen- 
functions of the operator KK'^ 



j=l 1=1 

We introduce vectors Uk 



(9) 



(10) 



The spectral function can then be expanded as 



k—1 i—l 



(11) 



(12) 



This expansion is very ill-behaved, since some of the 
eigenvalues are very small. The expansion is therefore 
truncated so that only eigenvalues are considered for 
which 



/ai > fJo, 



(13) 



where ai is the largest eigenvalue and (Tq is the accuracy 
of the data. In this way we only consider the n^, eigen- 
vectors with the largest eigenvalues. To further improve 
the method, the kernel K is multiplied by a "support" 
function, which is equal to one in the range where cr(a;) is 
expected to be large and vanishes smoothly outside this 
region. Here we have used the function 1/(1 -|- (w/wq)^), 
where = 5 was used. 



C. Maximum Entropy 

A popular method for analytical continuation is the 
maximum entropy method (MaxEnt);^ This method is 
based on Bayes's theorem^- 



Pr[(7,n] = Pr[cr|H]Pr[H] ^ Pr[H|cr]Pr[cr], 



(14) 



where Pr[(T, H] is the joint probability that the spectral 
function is a{uj) and that the QMC calculation gives the 
correlation function H(j/). While the MaxEnt method 
usually is formulated for imaginary time r, we here for- 
mulate it for imaginary frequency v, for reasons discussed 
in Sec lIVI Pr is the conditional probability that the 
spectral function is a{uj) provided that the correlation 
function H(j/) was obtained from the QMC calculation. 
From this one obtains^^ 



Pr[(T|H] 



Pr[n|cr]Pr[cr] 
Pr[H] 



(15) 



This rewriting converts the ill-posed problem of deter- 
mining a{uj) given H(t/) into the much easier problem 
of determining H(i^) given cr{uj). Pr[H] is a normaliza- 
tion factor, which is independent of cr(cj), and therefore 
is no complication. The remaining issue is then how to 
choose Pr[fT], which represents our prior knowledge about 
ct(w). If we put this probability to a constant and then 
maximize the liklihood function Pr[H|(T] the result is typ- 
ically very bad, resulting in a saw-tooth type of spectra 
In MaxEnt one therefore defines the prior probability in 
terms of a maximum entropy function 



duj{a{Lo) — m{uj) — fj{uj)h\ 



m(uj) 



(16) 
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where m{uj) is a default model. Other definitions are also 
possibleJ^ In the MaxEnt method the quantity 



Pr [n I cr] 



(17) 



is maximized, using an appropriate value for a.- Here we 
have chosen a according to the classic MaxEnt method, 
using a flat prior for ai^ 

We sometimes find that this approach leads to unphysi- 
cal oscillations in a^ui). We have shown that the reason is 
that the MaxEnt method sometimes chooses an a which 
attaches too much significance to the noise in the data. 
This problem can be avoided by using a modification of 
the MaxEnt methodiii^ We split the data for n(^) in sev- 
eral batches and perform a MaxEnt calculation for each 
batch. These results are then averaged. Typically, but 
not always, this leads to better results than averaging the 
data sets and then performing just one MaxEnt calcula- 
tion for the average. 

The choice of default model can influence the outcome 
substantially. Here we have chosen a "reasonable" but 
structureless model (see Sec. IIV[) . Using a model more 
similar to the actual spectrum improves the result. If 
the spectrum is calculated for several T, the result for a 
higher T can be used as the default model for a lower T. 
This can improve the results without introducing undue 
bias. Since we only consider one T here, we have not 
followed that approach. 



D. Sampling method 

The MaxEnt method avoids the saw-tooth problem, 
but the definition of entropy requires the introduction of 
a default model, which can bias the output. Instead we 
can averagei^ over Pr[cr|H] 



crFrffTlHlDa-, 



(18) 



where Da indicates a functional integral over all cr{Lo). 
Pr[cr|H] is given by Eq. (jlSp . where we furthermore put 
Pr[f7] = constant for all nonnegative cr(w). Thus we as- 
sume that this is our only prior knowledge of cr(cj). In 
Ref. [13 we worked in imaginary time space. For reasons 
discussed in sec. lIVI we here work in imaginary frequency 
space. Then the likelihood function is given hy^ 



PrfHlal = 



1 



nr=i(27ra,) 
<exp{-^[H(^,)-n.(^.)]V(2^-)}, 



(19) 



where ct^ is the accuracy of the data H(!/i), and Haii^i 
the transformation of cr(w) to imaginary frequencies. 
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FIG. 2: The optical conductivity for model (|2ip using Fi — 
0.60 and ao = 0.01 according to the Fade (a), the SVD 
method (b), the sampling (c) and the MaxEnt (d) methods 
compared with the exact results. Figs, a) and c) show results 
for different values of the maximum frequency i^max consid- 
ered and Fig. b) for different values of Ui,. Fig. d) shows 
results both for each individual sample (thin lines) and the 
average over all 10 samples as well as the model used. The 
thick line in (b) indicates the optimum value of = 5 and 
the thick line in (c) the largest value of !^max ~ 30 considered 
here. The x in the main part of figure c) shows the estimate 
of o-(O) by extrapolating [11(0) — 11(1/)] /;/ to zero. This is il- 
lustrated by the inset in figure c), where the symbol x gives 
the exact value of cr(0). The symbol o in (d) is the estimate 
of cr(0) based on n(^/2) [Eq. (O]. 
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IV. RESULTS 

To study methods of analytical continuation, we choose 
a model of 17(0;) on the real frequency axis, using calcu- 
lations for a two-dimensional (2d) Hubbard model as a 
guide. Using Eq. (jH), the corresponding n(z^) can easily 
be calculated. This is a well-behaved and stable transfor- 
mation. We generate results for the 60 smallest nonnega- 
tive frequencies. We add random noise to this calculated 

n^{i^^)^U{iy^){l + r^,^), (20) 

where r^^i has a Gaussian distribution with the width 
ctq. This simulates the data that may be obtained from 
a QMC calculation by solving the Bethe-Salpeter equa- 
tion. We generate 10 different sets of data using different 
random numbers for each set. 

In the DCA approach, n(^) is Fourier transformed to 
obtain n(r). This may require knowledge of n(i^) for 
frequencies where the calculation is not very accurate. 
Although this problem can usually be circumvented by 
using the asymptotic behavior of for large v, it then 
seems easier to work directly in v-space. Then if neces- 
sary, we can then decide to use fewer values of than is 
needed to converge the Fourier transform and only use 
values which we believe are accurate. Specifically for 
the present calculation, a Fourier transform to r-space 
would lead to additional complications. Although by 
construction the present 11^ (i/^) has a perfectly Gaus- 
sian noise which is uncorrelated for different values of 
Vi, the Fourier transformed data would have correlation 
between different r-points. Methods working in r-space 
and methods working in i^-space would then have data 
of different quality. To be able to compare all methods 
on an equal footing, we have therefore formulated them 
in imaginary frequency space, which essentially involves 
using kernels appropriate for this space. 

The Ilp,{i>i) data are then analytically continued back 
to the real axis, using methods of interest. Since we know 
the exact result, namely the cr(i^) we started from, we can 
test the accuracy of the methods. 

For the MaxEnt method we analytically continued each 
data set and then took the averageJ^ As discussed above, 
the reason is that the MaxEnt method tends to attach too 
much significance to the noise. The batching method re- 
duces the importance of the noise at the cost of using data 
with a lower accuracy. For the SVD (with the condition 
in Eq. |13p and sampling methods we have not noticed 
any tendency to overemphasizing the noise. Therefore we 
averaged the data before doing the analytical continua- 
tion to get data with the highest possible accuracy. For 
the Fade method with many data points on the imag- 
inary axis there is a strong tendency to overemphasize 
the noise. However, we have not noticed any general 
improvement by "batching" the data, and therefore also 
for the Fade approximation we averaged the data before 
doing the analytical continuation. 



The optical conductivity a{uj) typically has peaks at 
uj — and at approximately uj — iU, where U is the 
Hubbard on-site Coulomb interaction. We therefore use 
the real axis cr(uj) 

Wi W2 

"^"^ = ^i + (../rO^ + i + [(c.-.)/r.p (2^^ 

^l + [(o. + e)/r2]2^ + (u;/r3)6 

Here Fa ^ (Fi,F2) cuts off for large uj. Otherwise 
H(z^) would not decay as as it should. Here we let 
UJ and F; have the unit eV and a the unit (mfJcm)"^. 
Since the smallest nonzero frequency v = uq = 27rT, we 
expect structures on an energy scale much smaller than 

to be described very poorly. Here we use T = 1/15, 
giving i/Q = 0.42. We then choose two different models 
with Fi = 0.30 and 0.6, respectively. For both models 
we use r2 = 1.2, Fs = 4 and e = 3. We use the weights 
Wi = 0.3 and W2 = 0.2. 

Fig. [2] shows results for Fi = 0.6 and data with rel- 
atively good accuracy ctq — 0.01. Fig. ^ shows results 
according to the Fade approximation for different num- 
bers (fmax) of frequencies. For Vrnax — 5 the spectrum 
is rather structureless and the peak at a; = 3 is not well 
described. Since t'max = 5 corresponds to an imaginary 
frequency 1.7, smaller than the energy scale for the struc- 
tures on the real axis, this is not surprising. As t'max 
is increased and more information is added, this peak is 
formed, although at too small energy. The peak at w = 
is also not very well described. 

Fig. 03 shows results according to the SVD method. 
The eigenvalues ak in Eq. ^ are in this case 0.50, 0.12, 
0.048, 0.020, 0.0067, 0.0020, 0.00055, 0.00015... Th e op- 
timal value of n^, according to the criterion in Sec. IIII Bl 
and (To — 0.01 is then 5 and the corresponding results 
are shown by the thick line. Results are also shown by 
thin lines for = 3, 4, 6 and 7. = 3 is too small, 
and misses most of the structures. The values = A 
gives similar results as ti^^ = 5, while = 6 gives some 
unphysical oscillations and — 7 puts in large spurious 
structure, giving too much weight to the noise. 

Fig. ^ shows results from the sampling method. For 
small values of t'max the structures are poorly described, 
and, in particular, the Hubbard peak is placed at a too 
low energy. As discussed for the Fade approximation, 
this is not surprising since only information for small 
imaginary frequencies is used. As t'max is increased the 
description improves. For a large z^max — 30, shown by 
the thick line, the description is rather good. The inset 
shows the quantity ■^{ly) in Eq. (O and the extrapolation 
to v = 0, giving an estimate of (t(0). The symbol x in the 
inset gives the exact result and the x in the main figure 
[2t shows the result estimated from this extrapolation. 
This estimate is in this case somewhat too large. 

Fig. shows the MaxEnt results. Results are shown 
for each of the 10 data sets and also the average of the 
results is shown. Each MaxEnt spectrum shows rather 
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3: The same as Fig. [2] but for Fi = 0.3 (ao = 0.01). 
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FIG. 4: The same as Fig. Mi^i = 0.6) but for ao = 0.001. 



large spurious oscillations due to the method giving too 
much weight to the noise. The average of these spectra, 
however, is rather good. The symbol o in Fig. [2ji also 
shows the estimate in Eq. ^ of cr(0). This estimate is 
accidentally quite good, although the spectrum has sub- 
stantial variations over the range \uj\ < 0.6927rT, and the 



requirement for Eq. ([6]) is not well satisfied. The reason 
is that the noise happens to make this estimate accurate, 
while in Fig|3] with more accurate data the estimate is 
less good. 

Fig. [3]shows results for Fi = 0.3, i.e., a narrower peak 
at w = 0. The SVD, sampling and MaxEnt methods 
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co[eV] 

FIG. 5: The same as Fig. [2]but for Ti = 0.3 and ctq = 0.001. 



give comparable accuracy as in Fig. [21 The accuracy 
of the estimates of cr(0) from Eq. © (o in Fig. [5Ji) is 
worse than in Fig. [2l since the peak at a; = is narrower 
and assumption behind Eq. (jS]) is less well satisfied. The 
estimate from Eq. ([5]) (x in Fig. is of comparable 
accuracy as in Fig. [21 



Fig. [4]shows results for Fi = 0.60, i.e., a broader peak 
as in Fig. [21 but for very accurate data, Uo = 0.001. The 
accuracy of the Fade approximations is now improved, 
as expected. Because of the higher accuracy of the data, 
the optimum has increased from 5 to 7 for the SVD 
method. This leads to an improvement compared with 
Fig. [2}3, although there is a small unphysical oscillation 
at w 1. There is a reduced spread of the thin curves 
in Fig. [4ji, representing the MaxEnt result for each in- 
dividual data set. The average is only marginally im- 
proved. Fig. [51 shows high accuracy data for a narrow 
peak, Fi = 0.3. 

In Fig. [6] we compare the different methods for the 
two different spectra (F = 0.3 and 0.6) and for the two 
accuracies (cto =0.01 and 0.001) considered here. Since 
the value of cr(0) is of particular interest, we show re- 
sults for small lj (< 0.25) in the insets. Typically the 
SVD, sampling and MaxEnt methods are of comparable 
accuracy, while the Fade approximation tends to overes- 
timate (t(0). The differences between the results of these 
methods and the exact result are shown in Fig[71 

V. CORRELATION IN IMAGINARY TIME 

We have so far generated data for imaginary frequen- 
cies and then added Gaussian noise. The noise for differ- 
ent frequencies is uncorrelated and the covariant matrix 



is approximately diagonal. Here ^(^'i) is the average over 
the M samples n(^-'(j/i). If the data are obtained from a 
QMC calculation, C is in general not diagonal. There is 
then a need to make a transformation to a diagonal co- 
variant matrix. Here we follow Jarrell and Gubernatisi^ 
A matrix U is found such that 

C' = U-^CU (23) 

is diagonal. The data and kernel are then transformed 
to the new representation 

k' = U^^K U = U^^U (24) 

and the diagonal elements of C are used to define a new 
likelihood function. The result is that some of the di- 
agonal elements of the covariant matrix are now larger, 
implying less accurate data than one might have thought. 
This does not, however, change the general conclusions 
above. 

VI. CONCLUSIONS 

We have compared different methods for analytically 
continuation of imaginary axis data to real frequencies 
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for the optical conductivity. We transform spectra from 
the real frequency axis to the imaginary axis and add 
statistical noise. These data are then transformed back 
to the real axis using the different analytical continuation 
methods. By comparing with the original spectrum, we 
can compare the accuracy of these methods. Typically, 
these methods have problems if the spectra have features 
on a much smaller energy range than 2t:T. Due to the 
thermal broadening of physical spectra, this may not be 
a serious problem in many cases. Here we have focused 
on two cases where the relevant energy scale, Fi is 0.3 or 
0.6 compared with 2'kT = 0.42. 

We also considered two methods for obtaining cr(0) di- 
rectly, Eq. ^ and extrapolation of 7(^) in Eq. fS]) to 
J/ = 0. The method based on Eq. ^ tends to under- 
estimate cr(0), in particular if a{uj) has a narrow Drude 
peak, while the extrapolation of Eq. ([5]) is typically more 
accurate. 

Calculations for the cases considered in this paper as 
well as for results from DCA typically gives larger values 



for (t(0) in the Pade approximation than from the SVD, 
sampling and MaxEnt approaches. The Pade approxi- 
mation generally tends to give somewhat less accurate 
results than the other three methods. Sometimes un- 
physical results are obtained due to poles close to the 
real axis. The other three methods tend to give results 
of comparable accuracy. We nevertheless find it very use- 
ful to use all three methods. This provides cross checks 
and gives a somewhat better idea about what the true 
spectrum may look like. 
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FIG. 6: Comparison of the different methods for Fi — 0.6 and 
0.3 and for a = 0.01 and 0.001. The insets show a magnified 
view in the range < cj < 0.25. 
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FIG. 7: Difference between the spectrum calculated using one 
of the methods and the exact spectrum for the parameters 
Ti = 0.6 and 0.3 and for a = 0.01 and 0.001. 



